
Cornbread Muffin
The Chosen - Holy Warriors of Bob the Unforgiving
Posted - 2016.03.26 08:49:33 -
[2]
Pete Butcher wrote:
That further confirms no one really thought about Eve use cases.

I take it as a (potential) good thing. Not that having a Dust-appropriate API means they can't also have an EVE-appropriate API, but at least there's some operational reason for things to be how they are. That's a superior alternative to "this is how we think an EVE-appropriate API should function".
Reasonably up-to-date snapshots would be sufficient for many uses, and a bulk data endpoint could exist alongside the existing resources. File size and the bandwidth they want to support are legit concerns, but bandwidth usage could be kept the same and service levels massively improved with a bulk data endpoint.
The last snapshot I was able to get covered the 23 empire regions and 10,375 items (I took out stuff like Dust items from my requests, but that's most of the items in the game). That's 477,250 calls to the API. The response headers alone were 293 MB and the payload (compressed on the wire) was 97.3 MB. It took a little under 2 hours, so it transferred at ~0.5 Mbps overall. This was at ~70 rq/s, so roughly half the rate limit. I assume that's the bulk of the orders that exist in the entire game, but I've never requested null-sec markets so I don't know that for sure.
The same data gzipped as one file is 56.1 MB. 10.5 MB of that is The Forge. If I were adding a bulk data endpoint I'd pull out a lot of the redundant data (e.g. return no data at all for itemTypeIds that have no orders, remove most of the bulky urls), but even if the data was identical it would still be about 1/7th the size of what they are sending me now, most of which is wasted on headers. I could get the same market data at the same 0.5 Mbps every ~15 minutes and it wouldn't take any more data transfer on their part. That's a worthwhile improvement IMO.
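As a rough sanity check on the figures above, here is a minimal Python sketch of the arithmetic. The factor of two (one request each for buy and sell orders per item per region) is an assumption inferred from the call count, and the constant names are illustrative only.

```python
# Figures taken from the post; ORDER_SIDES is an assumption (buy + sell requests).
REGIONS = 23
ITEM_TYPES = 10_375
ORDER_SIDES = 2

HEADERS_MB = 293.0    # response headers across all calls
PAYLOAD_MB = 97.3     # compressed payload across all calls
BULK_MB = 56.1        # same data gzipped as a single file
REQUESTS_PER_SEC = 70
LINK_MBPS = 0.5       # observed overall transfer rate


def transfer_minutes(size_mb: float, mbps: float) -> float:
    """Minutes needed to move size_mb megabytes at mbps megabits per second."""
    return size_mb * 8 / mbps / 60


calls = REGIONS * ITEM_TYPES * ORDER_SIDES
total_mb = HEADERS_MB + PAYLOAD_MB
crawl_hours = calls / REQUESTS_PER_SEC / 3600
bulk_minutes = transfer_minutes(BULK_MB, LINK_MBPS)

print(f"calls: {calls}")
print(f"total transferred: {total_mb:.1f} MB")
print(f"crawl time at {REQUESTS_PER_SEC} rq/s: {crawl_hours:.2f} h")
print(f"bulk file at {LINK_MBPS} Mbps: {bulk_minutes:.1f} min")
print(f"crawl is {total_mb / BULK_MB:.1f}x the size of the bulk file")
```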

Cornbread Muffin
The Chosen - Holy Warriors of Bob the Unforgiving
Posted - 2016.03.27 04:39:21 -
[3]
William Kugisa wrote:
What about the idea of just having a simplified transaction/change log? The existing APIs, or the logs themselves once 3 months of them are available, would give anyone the initial data to then update from, so a CCP-provided snapshot is not needed. That would also give live / near-live data, without crawling or creating snapshots of the entire market at a comparable frequency. Ideally the cache time for this could be in seconds, rather than minutes or hours.
I think "snapshots" alone are less interesting if they are only once a day / every few hours; the "liveness" of such data would be comparable to EVE-Central, or crawling the current endpoint anyway. Also, third-party developers can easily create their own snapshot APIs then.

Diff updates would be brilliant as well. I agree that slow snapshots are worthless since people will just use the "real time" (5-minute cache) endpoints instead. Snapshots will only be used if they can get you fresher bulk data than hitting all endpoints individually can. Fortunately, this should be easy.
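To make the diff/change-log idea concrete, here is a minimal Python sketch that derives a change list from two snapshots and replays it onto a local order book. The change format (`"add"`, `"update"`, `"remove"` tuples keyed by order ID) and both function names are hypothetical; nothing like this exists in the current API.

```python
def diff_snapshots(old: dict, new: dict) -> list:
    """Return the changes that turn snapshot `old` into snapshot `new`.

    Both snapshots map an order ID to a dict of order fields.
    """
    changes = []
    for order_id, order in new.items():
        if order_id not in old:
            changes.append(("add", order_id, order))
        elif old[order_id] != order:
            changes.append(("update", order_id, order))
    for order_id in old:
        if order_id not in new:
            changes.append(("remove", order_id, None))
    return changes


def apply_changes(book: dict, changes: list) -> dict:
    """Return a copy of `book` with the change list applied in order."""
    updated = dict(book)
    for op, order_id, order in changes:
        if op == "remove":
            updated.pop(order_id, None)
        else:  # "add" and "update" both store the latest order state
            updated[order_id] = order
    return updated


# Made-up orders purely for illustration.
old = {1: {"price": 10.0, "volume": 5}, 2: {"price": 11.0, "volume": 3}}
new = {2: {"price": 11.0, "volume": 1}, 3: {"price": 9.5, "volume": 7}}

changes = diff_snapshots(old, new)
rebuilt = apply_changes(old, changes)
```

A client would only need a full snapshot once and could then stay current by replaying the (much smaller) change list.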
Based on the findings I put up in my previous post, I would think they could provide regional snapshots every 15 minutes with no issues. Even faster (or the same speed but with lower bandwidth requirements) if they prune the data down to a minimal format.
The current cache is 5 minutes, so I'll use that as the definition of "real time". For the data I retrieved it breaks down like this:
Name                              Size (gzip+headers)    Bandwidth needed @ 5 min updates
----------------------------------------------------------------------------------------
Current API (current format)      390.3 MB               10.4 Mbps
Bulk Endpoint (current format)    56.1 MB                1.5 Mbps
Bulk Endpoint (slim format)       7.7 MB                 0.2 Mbps
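The bandwidth column follows directly from size and update interval. A small Python sketch of that calculation, assuming decimal megabytes (1 MB = 8 megabits) and a 300-second interval; the function name is illustrative:

```python
UPDATE_INTERVAL_S = 5 * 60  # the 5-minute cache used as "real time"


def required_mbps(size_mb: float, interval_s: float = UPDATE_INTERVAL_S) -> float:
    """Sustained megabits per second needed to resend size_mb every interval_s."""
    return size_mb * 8 / interval_s


# Sizes from the table above.
sizes_mb = {
    "Current API (current format)": 390.3,
    "Bulk Endpoint (current format)": 56.1,
    "Bulk Endpoint (slim format)": 7.7,
}

bandwidth = {name: required_mbps(size) for name, size in sizes_mb.items()}

for name, mbps in bandwidth.items():
    print(f"{name:32s} {mbps:5.1f} Mbps")
```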
The slim format looks like this:
    {
      "regions": [
        {
          "id": "10000002",
          "lastUpdate": "2016-01-28T14:56:39",
          "orderFields": ["orderId", "itemTypeId", "issued", "stationId", "volumeEntered", "volume", "minVolume", "range", "price", "duration", "buy"],
          "orders": [
            [911346113, 0, "2016-03-02 06:40:50", 60006484, 22778, 22778, 1, "region", 14.00, 365, true],
            ...
          ]
        },
        ...
      ]
    }
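Consuming the slim format is cheap on the client side: the field names are sent once per region and zipped back onto each order row. A minimal Python sketch using the single sample row above; the `expand_orders` helper and the added `regionId` key are illustrative assumptions, not part of any existing API.

```python
import json

# One region with the one sample order from the post (elided rows omitted).
SAMPLE = """
{
  "regions": [
    {
      "id": "10000002",
      "lastUpdate": "2016-01-28T14:56:39",
      "orderFields": ["orderId", "itemTypeId", "issued", "stationId",
                      "volumeEntered", "volume", "minVolume", "range",
                      "price", "duration", "buy"],
      "orders": [
        [911346113, 0, "2016-03-02 06:40:50", 60006484, 22778, 22778, 1,
         "region", 14.00, 365, true]
      ]
    }
  ]
}
"""


def expand_orders(document: dict):
    """Yield one dict per order row, pairing each value with its field name."""
    for region in document["regions"]:
        fields = region["orderFields"]
        for row in region["orders"]:
            order = dict(zip(fields, row))
            order["regionId"] = region["id"]  # added here so rows stay traceable
            yield order


orders = list(expand_orders(json.loads(SAMPLE)))
print(orders[0]["orderId"], orders[0]["price"], orders[0]["buy"])
```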
They already transfer to me at 0.5 Mbps; I just get updates every few hours instead of every 5 minutes. Sending the same dataset 60x more often while still using 60% less bandwidth seems like a no-brainer. If bandwidth is still a big deal you could have bulk endpoints use protocol buffers or some other binary format. I just went with the simplest possible update from the current API.